Polygenic trait determinants: maize dwarf mosaic virus

ABSTRACT

A set of nucleic acid probes useful for tracking Maize Dwarf Mosaic Virus (MDMV) resistance, a polygenic trait, is provided. Chromosome segments are identified enabling isolation of the genes governing this trait.
A general method for identifying probes useful for tracking and introgressing polygenic traits into elite genomes and identifying chromosome segments governing the traits is also provided. This method involves the analysis of RFLP polymorphisms between parent donor and recipient genotypes and observed phenotypic data using multiple regression by leaps and bounds ("leaps"), followed by standard multiple regression applied to the "leaps" data. The "leaps" data may be used to identify flanking markers, and epistasis may be determined by analysis of the multiple regression data.
Kits comprising a subset of the most useful probes (those most closely linked to the trait of interest) are provided. The kits may also comprise flanking probes. Flanking probes used in combination with the most closely linked probes are useful in identifying situations in which donor DNA tends to move in clumps and recovering rare individuals in which traits of interest have separated from surrounding donor DNA, so that elite recipient DNA may be maximized.

This is a divisional of application Ser. No. 07/126,767, filed Nov. 30, 1987, now abandoned.

FIELD OF THE INVENTION

This invention lies in the field of genetic engineering using recombinant nucleic acid markers, and specifically in the field of plant breeding.

BACKGROUND OF THE INVENTION

Genetic linkage has been studied and linkage maps have been developed for a wide variety of species, including plant species. Localization of genes of interest can be accomplished through linkage analysis with mapped markers as described by Patterson, E. B. (1982) "The mapping of genes by the use of chromosomal aberrations and multiple marker stocks", pp. 85-88, In: Maize for Biological Research (W. F. Sheridan, ed.) University Press, University of North Dakota, incorporated herein by reference.

The concept of using markers associated with favorable agronomic traits to track and recover the favorable traits in segregating populations is known to the art, e.g. Atkins et al. (1942), "The isolation of isogenic lines as a means of measuring the effects of awns and other characters in small grains," J. Amer Soc Agron 34:667-668; Everson, et al. (1955), "The genetics of yield differences associated with awn barbing in the barley hybrid (Lion×Atlas)×Atlas," Agron. J. 47:276-280; Carol Rivin et al. (1983) "Evaluation of Genomic Variability at the Nucleic Acid Level," Plant Mol. Biol. Reporter Vol. 1, p. 9; Helentjaris, T. G., PCT Application published Dec. 6, 1984, "Process for genetic mapping and cross-breeding thereon for plants".

Such genetic linkage has been invaluable in the introgression of specific chromosomes or chromosome segments into various genetic backgrounds (Rick, C. M. and Khush, G. S. (1969) "Cytogenic explorations in the maize genome", pp. 45-68, In: Genetics Lectures Vol. I (R. Bogart, ed.), Oregon State University Press, Corvallis; and C. Rhyne (1960) "Linkage studies in Gossypium II altered recombination values in linkage group of allotetraploid G. hirsutum L. as a result of transferred diploid species genes" Genetics 45:673-683). The use of genetic markers speeds the transfer of a specific locus to a desirable genotype. In plant breeding, tissue of young plants can be tested for the presence of marker alleles linked to the desirable trait, and only individuals displaying the presence of such marker alleles need be grown to adulthood, transplanted and used to produce progeny, thus eliminating many time-consuming steps required in traditional plant breeding. For example, the tomato nematode resistance gene, mi, has been successfully transferred through linkage with an acid phosphatase isozyme marker (Tanksley, S. D. et al. "Use of an Acid Phosphatase Isozyme for Predictive Association with an Agronomic Trait," Plant Mol. Biol. Rep., In press). Such markers are also useful in facilitating the recovery of a desired recurrent parent in a backcrossing program (e.g. S. D. Tanksley, H. Medina-Filho and C. M. Rick (1981) "The effect of isozyme selection on metric characters in an interspecific backcross of tomato - basis of an early screening procedure" Theor. Appl. Genet. 60:291-296).

Molecular markers such as isoenzyme, protein and nucleic acid markers, the variants of which do not often have any noticeable effect on phenotype, are preferred over the phenotypic markers used in classical breeding methods. See Newton, K. J. et al. (1980) "Genetic basis of the major malate dehydrogenase isozymes in maize," Genetics 95:424-442; Goodman, M. M. et al. "Maize", Isozymes in Plant Genetics and Breeding, Part B (Tanksley, S. D. et al. eds.) (1983) Elsevier Science Publishers.

Nucleic acid markers provide certain advantages over isozyme and protein markers. With DNA markers, allelic variation is detected by first digesting DNA from the individuals being analyzed with a variety of restriction endonucleases. The resulting fragments are separated by electrophoresis and transferred to solid support matrices. Allelic fragments are then identified by hybridizing the DNA on the supports to cloned, radioactively-labelled, homologous sequences. Genetic variation detected in this manner has often been referred to as restriction fragment length polymorphism (RFLP). The number of RFLP's is virtually unlimited. They are unlikely to have an effect on phenotype, are codominant and are inherited in a predictable fashion.
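As an illustrative sketch of how a difference at a restriction site gives rise to fragments of different lengths, the following Python fragment cuts two short, invented sequences at the EcoRI recognition site GAATTC. The sequences, the function name `digest` and the allele labels are hypothetical and serve only to illustrate the principle; electrophoresis, blotting and probe hybridization are not modelled.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return the fragment lengths obtained by cutting a linear sequence
    at every occurrence of `site` (EcoRI cuts after the first base, G^AATTC)."""
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(len(sequence) - start)
    return fragments


# Hypothetical alleles: allele_b carries a point change (A -> G) that
# destroys the second EcoRI site present in allele_a.
allele_a = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TTTT"
allele_b = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TTTT"

print(digest(allele_a))  # [5, 14, 9]
print(digest(allele_b))  # [5, 23]
```

A probe homologous to the region spanning the altered site would therefore detect fragments of different length in the two alleles, which is the polymorphism scored in an RFLP analysis.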

A theoretical discussion applying known methods of genetic mapping to RFLP's and practical applications thereof is given in Beckmann, J. S. and Soller, M. (1983), "Restriction fragment length polymorphisms in genetic improvement: methodologies, mapping and costs", Theor. and Appl. Genetics 67:35-43; and Soller, M. and Beckmann, J. S. (1983), "Genetic polymorphism in varietal identification and genetic improvement," Theor. and Appl. Genetics 67:25-33, both of which are incorporated herein by reference. See also Burr, B., Evola, S. D., Burr, F. A. and Beckmann, J. S. (1983), "The application of restriction fragment length polymorphisms to plant breeding", Genetic Engineering Principles and Methods, (Setlow and Hollander, eds.) Vol. 5:45-49, also incorporated herein by reference, and Ellis, T. H. N. (1986) "Restriction Fragment Length Polymorphism Markers in Relation to Quantitative Characters", Theor. Appl. Genet. 72:1-2. The usefulness of RFLP mapping for maize also has been discussed by S. V. Evola et al. (1986) "The suitability of restriction fragment length polymorphisms as genetic markers in maize", Theor. Appl. Genet. 71:765-771. No specific map positions for any DNA probes are discussed in any of the above articles.

Map positions for many cloned DNA sequences have been reported in connection with maize (Zea mays): Helentjaris, T. et al. (1986) "Use of monosomics to map cloned DNA fragments in maize", Proc. Natl. Acad. Sci. USA 83:6035-6039. This article reports the identification of 112 loci using RFLP's. The fragments mapped by Helentjaris et al. are defined relative to their relationship to certain previously-mapped markers, and relative to each other. This article is incorporated herein by reference. Other mapping efforts are currently in progress throughout the industry and the maize genome is rapidly becoming saturated with mapped molecular markers which are freely available to the public.

While nucleic acid (RFLP) markers have been used to locate and manipulate traits determined by single genes, they have not been successfully used to locate and manipulate traits determined by more than one gene. Burr, B. and Burr, F. A. (1985), "Toward a Molecular Characterization of Multiple Factor Inheritance," Biotech. in Plant Sci. (Zaitlin, M. et al. eds.) discusses this concept in general with respect to quantitative traits without providing specific enablement. Landry, B. S. and Michelmore, R. W. (1985), "Methods and Applications of Restriction Fragment Length Polymorphism Analysis to Plants," Tailoring Genes for Crop Improvement (Bruening G., et al. eds.) 25-44 is a general review article containing a section discussing the use of molecular markers to track and manipulate quantitative trait loci, but without providing enabling disclosure.

A disadvantage in the use of molecular markers for tracking and breeding traits is the fact that cross-overs occurring in progeny predictably will separate the trait of interest from the linked marker used to track it in a certain percentage of individuals. See Nienhuis, J. et al. (1987), "Restriction Fragment Length Polymorphism Analysis of Loci Associated with Insect Resistance in Tomato," Crop Sci. 27:797-803.

Another disadvantage of prior methods for tracking traits using molecular markers is the fact that a particular linked marker allele may not invariably correlate with the presence of the phenotype being studied. Many phenotypes are developmentally expressed, and unless the populations are scored at multiple times during their life cycles, important associated marker alleles can fail to be identified.

Helentjaris, T. (1987), "A genetic linkage map for maize based on RFLPs," Trends in Genetics 3:217-221 provides a maize linkage map and several loci for plant height determinants with the relative contribution of each locus to the phenotype indicated. No enabling method for determining such loci is provided, however. Edwards, M. D., et al. (1987), "Molecular-Marker-Facilitated Investigations of Quantitative-Trait Loci in Maize. I. Numbers, Genomic Distribution and Types of Gene Action," Genetics 116:113-125, provide a method for locating quantitative trait loci using molecular markers. In this method, single-factor analysis is used to determine loci associated with a number of different traits. This analysis was followed by a multiple regression method to determine the relative contribution of each such locus to the given trait. This method, while identifying loci determining polygenic traits and the relative contribution of each, has the drawback of failing to provide a method for ensuring against loss of the trait being tracked due to cross-over in progeny populations. The method described above also fails to take into account the possibility of developmentally-expressed phenotypes.

Nienhuis, J. et al. (1987), "Restriction Fragment Length Polymorphism Analysis of Loci Associated with Insect Resistance in Tomato," Crop Sci. 27:797-803 discloses the use of RFLP technology to identify quantitative trait loci affecting expression of insect resistance in a wild tomato species. Conventional linkage analysis was used to locate RFLP loci associated with the trait, followed by linear and multiple regression to determine the relative contribution of each locus. Analysis of the residual plots indicated that one or more additional loci with major effects had not been identified. The article suggests the use of flanking markers to localize a target quantitative trait locus, but characterizes this as "problematic."

No previously described method for locating DNA governing polygenic traits has been successfully used to introgress such traits into a second or elite genotype.

The present application provides a method for tracking and manipulating polygenic traits in a breeding program which solves the problem of loss of the trait due to cross-over in the progeny population. This method involves the analysis of molecular marker linkage data for a predetermined polygenic trait by the method of multiple regression by leaps and bounds (Furnival, G. M. and Wilson, Jr., R. W. (1974) "Regression by leaps and bounds," Technometrics 16:499-511). This method was developed to assess the relative contributions of causative factors on effects (i.e. numerous independent factors on dependent variables), and has not previously been applied to genetic analysis, possibly because of lack of appreciation by those skilled in the art of the possibility of making an analogy between such classical causative factors and marker alleles.

The method of the present application also ensures that marker alleles corresponding to developmentally expressed phenotypes are identified.

The method of the present application is exemplified by the identification of loci determining maize dwarf mosaic virus (MDMV) resistance in maize. Maize dwarf mosaic virus occurs throughout the United States and Europe. Resistant cultivars of dent corn have been developed, but sufficient genetic loci determining such resistance to enable introgression of the trait into a variety lacking such resistance have not been previously identified. In an abstract for a presentation to the 78th Annual Meeting of the American Society of Agronomy at New Orleans, Louisiana, Nov. 30 through Dec. 5, 1986, G. E. Scott reports the linkage of MDMV resistance to endosperm color in corn, concluding that one or more genes for resistance must be located on the long arm of chromosome 6. The abstract does not provide an enabling disclosure nor locate the gene or genes with sufficient exactitude to enable their isolation. Resistant cultivars of sweet corn having quality factors acceptable to the industry have not been developed, leading to serious economic losses in the United States due to MDMV. Use of identified loci for MDMV resistance is thus useful for producing inbred cultivars of resistant sweet corn.

Inheritance of resistance to MDMV is not clearly understood. The number of genes which contribute to resistance and the nature of gene action appear to be significantly dependent upon the source of MDMV resistance, the susceptible inbreds, the time of scoring, and the method of inoculum production and application (Louie, R. (1986), "Effects of genotype and inoculation protocols on resistance evaluation of maize to maize dwarf mosaic virus strains," Phytopathology 76:769-773).

Roane et al. (1983), "Inheritance of resistance to maize dwarf mosaic virus in maize inbred line Oh7B," Phytopathology 73:845-850, reported that in crosses between the resistant line Oh7B and two susceptible lines, Oh43 and Pa91, the inheritance of resistance was conditioned by one dominant gene. Rosenkranz, E. and Scott, G. E. (1984), "Determination of the number of genes for resistance to maize dwarf mosaic virus strain A in five corn inbred lines," Phytopathology 74:71-76, showed that the inbreds Ga203, Ar254, and Pa405 appear to have three, two and five additive resistance genes respectively. Crosses in which the resistant lines B68 or Pa405 were the donors, and susceptible sweet corns were the recipients, revealed three genes, one of which must be present with the other two (Mikel, M. A. et al. (1984), "Genetics of resistance of two dent corn inbreds to maize dwarf mosaic virus and transfer of resistance into sweet corn," Phytopathology 74:467).

The difficulty of assessing genotype from phenotype, and the existence of as many as five significant genes, make MDMV resistance an ideal problem for the application of RFLP technology. A further difficulty is provided by the fact that genomic material of MDMV-resistant inbred lines tends to move in large segments. This makes it difficult to maximize the presence of genes governing the desired trait from the donor parent while minimizing the presence of surrounding, less desirable DNA. This problem is not specific to MDMV, but is a common problem which is difficult to identify and deal with not only in maize but in the selective breeding of other species as well. The present invention involves the identification of chromosome regions which are associated with MDMV resistance, the prediction of which progeny in an advanced generation will be resistant and which not, and the assessment of recovery of the elite genotype. Rates of convergence upon the desired genotype are significantly increased while risk of losing essential marker loci is substantially reduced.

SUMMARY OF THE INVENTION

A set of primary probes or clones linked with genes determining maize dwarf mosaic virus resistance or susceptibility is provided. In the preferred embodiment, the probes are DNA probes having sequences hybridizable to portions of the maize genome close to (having at most about 10% recombination with) the genes of interest. These preferred clones are designated r179, c587, c512, c926, c329, gp144, r262 and r92. A library containing these probes in plasmids is on deposit according to Budapest Treaty requirements at the In Vitro International Depository of 611P Hammonds Ferry Road, Linthicum, Maryland 21090, deposited Nov. 30, 1987, entitled "Corn (Zea mays) Nuclear DNA Clones," under Accession No. IVI-10150.

A further set of flanking probes are provided to enable detection of a segment of genomic DNA known to contain the gene governing MDMV resistance. When an individual shows marker alleles corresponding to the parent donating the trait at both the locus of the primary probes and the flanking probes, it is known that the individual has the gene in question since the marker probe is selected such that the gene lies between the primary and the flanking loci or between two flanking loci on either side of the gene. When an individual shows marker alleles corresponding to the parent donating the trait at the locus of the primary probes and not the flanking probes, and still shows the phenotype associated with the locus, it is known that the individual has the desired gene, with minimal extraneous DNA from the donor parent. Use of these flanking probes enables the breeder to detect situations in which genomic material from the donor parent is moving in large segments, to identify the rare occurrence of individuals in which such large segments have not been transferred, and to maximize the presence of the elite DNA from the recipient parent.

A "flanking locus" as used herein, means a locus determined by the statistical methods described herein to have the second largest contribution to phenotypic variability among a set of linked probes. The "primary locus" is the locus having the largest contribution of the set of linked probes.

The flanking probes are designated r250, r271, gp53, gp52, r189, r21 and c595. These probes are on deposit with In Vitro International as part of the clone library referred to above.
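The two situations distinguished above for a primary probe and its flanking probe can be sketched as a small decision function. This is a minimal illustration only: the function name `classify`, the allele labels "donor" and "recipient", and the returned descriptions are hypothetical and do not form part of the deposited materials or of the scoring procedure.

```python
def classify(primary_allele, flanking_allele, shows_phenotype):
    """Interpret one individual at one primary/flanking probe pair.

    primary_allele, flanking_allele: "donor" or "recipient" (hypothetical labels)
    shows_phenotype: True if the individual shows the phenotype associated
                     with the locus
    """
    if primary_allele == "donor" and flanking_allele == "donor":
        # Donor alleles at both loci: the gene lying between them is present,
        # and the donor segment has been transferred as a block.
        return "gene present, donor segment intact"
    if (primary_allele == "donor" and flanking_allele == "recipient"
            and shows_phenotype):
        # Donor allele at the primary locus only, phenotype still shown:
        # the gene is present with minimal extraneous donor DNA.
        return "gene present, minimal donor DNA"
    # Any other combination is not resolved by these two rules alone.
    return "not established"


print(classify("donor", "donor", True))
print(classify("donor", "recipient", True))
print(classify("recipient", "recipient", False))
```

Individuals falling in the second class are the rare recombinants a breeder would seek to recover, since they retain the trait while carrying the least donor DNA around it.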

The terms "clone" and "probe" are used interchangeably herein to refer to a nucleic acid fragment containing a sequence which is substantially homologous (preferably at least about 85% homologous) to a genomic DNA sequence and capable of hybridizing to said genomic DNA sequence. A "clone" or "probe" may contain more or less nucleic acid than the restriction fragment to which it hybridizes. "Clone" or "probe" as used herein may refer to a linearized plasmid containing the nucleic acid fragment corresponding to a genomic DNA sequence, or to a fragment including extraneous sequences, such as tails and vector sequences, so long as it hybridizes to the genomic DNA.

The terms "trait", "characteristic", and "phenotype" are used interchangeably herein. A "trait" can be a classical phenotype such as the maize phenotypes maize dwarf mosaic virus (MDMV) resistance, japonica, crinkly leaves, dwarf plant, etc., an enzymatic factor, or the characteristic of showing a particular restriction fragment length polymorphism when the DNA is digested with a particular restriction enzyme and probed with a particular clone. The latter is sometimes specifically referred to as a "marker allele."

The term "marker" refers to a genetic element (DNA governing a trait) which has been mapped, or for which recombination frequencies with other genetic elements have been determined. A "marker" can be any trait whose relationships with other markers are known. Isozyme markers known to the art such as idh2, enp1, and mdh1 are useful in the practice of this invention. "Marker clones" or "DNA, RNA or RFLP markers" are clones of this invention or a nucleic acid fragment whose loci on chromosomes or linkage groups have previously been determined.

A "locus" is a site on the genome corresponding to an observable trait. In the case of an RFLP trait, the locus (or loci) are DNA sequences which hybridize to a particular clone or probe.

"MDMV resistance" defining a trait is used to mean both MDMV resistance and MDMV susceptibility since the trait itself includes both ends of the spectrum. The statistical methods described herein refer to a scoring method for this trait in which higher numbers indicate susceptibility, or observable presence of the disease, and lower numbers indicate resistance, or relative absence of the disease.

DNA fragments comprising DNA sequences governing MDMV resistance are also provided. These fragments may be isolated and sequenced by means known to the art, and are the segments of the genome falling between flanking and primary markers or between flanking markers. For purposes of this invention, it is not necessary to identify the chromosome on which each segment occurs; however, this information is provided as a matter of general information. The numbers in parentheses below refer to map distances between the markers, or more accurately, recombination frequencies between the markers. These numbers may vary from cultivar to cultivar, and are not part of the essential definition of the DNA fragments. The DNA fragments of this invention are:

Chromosome 1: c587 (15.4) c512 (3.8) r250. Alternatively, only the segment c512-r250 may be used.

Chromosome 3: r179 (8.7) r271.

Chromosome 5: c926 (5.4) gp53.

Chromosome 5: c329 (9.8) gp52.

Chromosome 6: gp144 (10.4) r189.

Chromosome 8: r262 (11.1) r21.

Chromosome 9: c595 (1.6) r92. Alternatively, the fragment on Chromosome 9 may be defined as the segment of Chromosome 9 lying between markers on either side of c595 and having a percent recombination rate with c595 of no more than about ten.

The probes and DNA fragments of this invention may be used to develop additional or substitute probes mapping to the same or contiguous regions. For example, any other phage or plasmid clone (or subclone thereof) which hybridizes to a clone of this invention is a substitute clone. Nucleic acid hybridization conditions may be employed by those skilled in the art utilizing well-known, published equations, for example as described in Nucleic Acid Hybridization: A Practical Approach, (Hames, B. D. and Higgins, S. J., eds.) (1985), IRL Press, Oxford. To maximize accuracy of results, it is preferred that the hybridization stringency be such that sequences which are less than about 85% homologous will not hybridize. Any new probe or DNA fragment which is identified using a probe or fragment of this invention is an equivalent to the probe or fragment of this invention.

Both DNA and RNA versions of the probes and fragments are covered by this invention. RNA probes and fragments may be transcribed or synthesized using means known to the art once DNA versions of the probes and fragments have been developed.

Equivalent probes or markers may be used to define chromosome segments comprising DNA governing MDMV resistance, and chromosome segments so defined are equivalent to the chromosome segments defined by the probes named herein and are within the scope of this invention.

The probes may be usefully combined into kits useful to plant geneticists for manipulating the MDMV resistance trait. An essential probe is r179. This probe is essential for the expression of resistance (i.e., it is epistatic to each of the following probes). The genomic DNA fragment r179-r271 contains the actual gene governing the trait at this locus. The kit therefore should contain probe r179 and flanking probe r271.

A kit additionally comprising the primary probe gp144 with or without its associated flanking marker, r189, defining DNA segment gp144-r189, will be useful to account for about 37-41% of the phenotypic variability, provided that the B68 alleles of r179 alone or in combination with its flanking marker r271 are present.

The further addition of primary probe c512, with or without its associated flanking probe r250, defining DNA segment c512-r250, or the second linked probe, c587, defining DNA segment c587-r250, will account for up to about 79-84% of the phenotypic variability, provided that the B68 alleles of r179 alone or in combination with its flanking marker r271 are present and the B68 alleles for gp144 alone or in combination with its flanking marker r189 are present.

As the remaining probes are added, each will contribute an approximately equal further degree of predictability. These remaining probes, which may be added individually or in combination, are c926, with or without its associated flanking probe, gp53, defining DNA segment c926-gp53; c329, with or without its associated flanking probe, gp52, defining DNA segment c329-gp52; r262, with or without its associated flanking probe, r21, defining DNA segment r262-r21; and r92, with or without its associated flanking probe, c595, defining DNA segment c595-r92.

The probe r92 has two loci on the maize genome, on chromosome 1 and chromosome 9. To ensure that the correct locus is identified, the band size associated with r92 may be ascertained by determining linkage with c595, and the appropriate band size followed, as known to the art.

The methods described herein may be used to locate additional probes at additional loci with lesser contributions to the phenotype in the cultivars studied, or with greater or lesser contributions in other cultivars. Kits comprising such additional probes, alone or in combination with the probes described herein, are included within the scope of this invention. Preferably a kit for a given set of cultivars contains the primary and more preferably also the flanking probes associated with loci having the most effect on the phenotype. Additional probes for loci having lesser effect on the phenotype may be added as economic feasibility dictates.

A generalized method for identifying a heritable association between nucleic acid marker probes and a polygenic phenotype not limited to maize is provided. A "polygenic" trait is a trait controlled by multiple genetic loci. Preferably, at least about 80% of the trait is governed by no more than about four loci, as the fewer loci required to manipulate the trait in a breeding program, the more convenient and economically feasible such manipulation will be. Quantitative traits such as height and yield are often polygenic traits, but are not necessarily so. The preferred embodiment for this method exemplified herein involves maize. This method comprises:

(a) Analyzing DNA from a first parent having said phenotype and a second parent not having said phenotype by RFLP analysis to determine a set of sufficient nucleic acid marker probes which show different RFLP marker alleles in the two parents to cover a significant portion of the genome of the species. Preferably, probes are selected from a previously mapped genome at evenly spaced intervals along the genome; preferably at least one probe per chromosome or chromosome arm is selected, and more preferably, probes are selected at more or less regular intervals, preferably of about 10 to about 20 map units. Markers other than RFLP probes may be used in this analysis; however, RFLP probes are preferred. As discussed above, the maize genome has been mapped with publicly available clones and other markers which may be used for this purpose. It is not necessary, however, that the genome be mapped or locations of the probes be previously selected. It is possible to develop a set of random clones, as is known to the art, for use in this invention without knowing map locations, chromosome locations, or even how many chromosomes the organism possesses.

As is known to the art, RFLP's may be developed using one or more restriction enzymes to cut the genomes being studied. Preferably, only one restriction enzyme is used. In the preferred embodiment this enzyme is EcoRI.

(b) Crossing said parents to obtain a progeny population of individuals which are segregating for said phenotype and selecting and scoring a statistically significant number of segregating individuals for the percent presence of said phenotype. Preferably, both the incidence of the phenotype in the population is scored and the severity is rated in each individual. More preferably, scoring is done at several times during the life cycle of the individuals so that developmentally occurring phenotypes can be associated with marker alleles.

(c) Analyzing DNA from said selected individuals to determine which parental marker alleles are present in each individual. This analysis is done by means known to the art, and is discussed in more detail hereinafter.

(d) Analyzing the data of steps (b) and (c) by multiple regression by leaps and bounds ("leaps") to determine a subset comprising a minimum number of primary marker alleles, preferably RFLP marker alleles, correlated with a maximum percent presence of said phenotype. This method is known to the art as described above, but has not previously been applied to genetic analysis. Preferably, phenotype severity data is included in this analysis as well, and more preferably, data from several ratings for each individual taken at two or more times in the life cycle of the individual are also used. Preferably, the data generated in this analysis are further analyzed to determine flanking markers, by examining the successive sets of marker loci chosen by "leaps" for those associated with the trait at each locus, but not as closely as the primary alleles. The "leaps" analysis will confirm that the trait is, in fact, polygenic.
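The subset selection of step (d) can be illustrated on a toy data set. The sketch below is not the Furnival and Wilson leaps-and-bounds algorithm itself, which uses branch-and-bound to avoid fitting every subset; it is a brute-force search over all subsets, which finds the same best subset of each size when the number of markers is small. The marker names m1 to m4, the 0/1 allele coding (1 = donor allele), the scores and all function names are invented for illustration.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of a least-squares fit of y on the given columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(design[i][p] * design[i][q] for i in range(n))
            for q in range(k)] for p in range(k)]
    xty = [sum(design[i][p] * y[i] for i in range(n)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def best_subsets(markers, y, max_size):
    """For each subset size, return (R^2, marker names) of the best subset."""
    names = sorted(markers)
    best = {}
    for size in range(1, max_size + 1):
        scored = [(r_squared([markers[m] for m in combo], y), combo)
                  for combo in combinations(names, size)]
        best[size] = max(scored)
    return best


# Invented data: 8 individuals, 4 markers, disease score (higher = susceptible).
markers = {
    "m1": [0, 0, 0, 0, 1, 1, 1, 1],
    "m2": [0, 0, 1, 1, 0, 0, 1, 1],
    "m3": [0, 1, 0, 1, 0, 1, 0, 1],
    "m4": [0, 1, 1, 0, 1, 0, 0, 1],
}
scores = [5.1, 4.9, 4.0, 4.1, 3.0, 2.9, 2.1, 1.9]

best = best_subsets(markers, scores, max_size=2)
for size, (r2, combo) in best.items():
    print(size, combo, round(r2, 3))
```

In this invented example the search selects m1 alone as the best single marker and m1 with m2 as the best pair, mirroring the way successive "leaps" subsets point to the primary loci.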

The method preferably continues with an analysis of said subset by multiple regression, a method known to the art, to determine the relative contribution of each primary marker allele to the phenotype. This is important to the accuracy of the predictive value of the loci developed. For example, in the preferred embodiment described herein, several loci which were consistently picked by the "leaps" analysis did not contribute as highly to the trait as the loci defined by the claimed probes.

The multiple regression analysis determines what percent of the trait has been accounted for by the identified loci. The method also makes it possible to rank loci according to their contribution to the presence of the trait. It is desirable for efficiency of use in breeding that a minimum number of loci having a maximum effect on the trait be identified and used.

In addition, the multiple regression data makes it possible to determine epistatic effects of particular loci by preparing a normal quantile-quantile plot of the multiple regression data. If the graph of observed deviation of the data from the straight line assumed by the method itself deviates from a straight line, indicating that the trait is actually more pronounced or severe than predicted at the high end and less pronounced or severe than predicted at the low end, epistasis is indicated. Graphing of the multiple regression data visually demonstrates such epistasis. In the preferred embodiment described below, for example, the r179 locus was shown to be epistatic to other loci, e.g. those at c512 and gp144.
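A normal quantile-quantile comparison of regression residuals can be sketched as follows. The use of a correlation coefficient as a numerical measure of straightness is an assumption added here for illustration; the text relies on visual inspection of the graph. The function names and the two example residual sets are likewise hypothetical.

```python
from statistics import NormalDist, mean


def qq_points(residuals):
    """Pair each ordered residual with the matching standard normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


def qq_straightness(residuals):
    """Pearson correlation of the Q-Q points; 1.0 means a perfectly straight plot."""
    points = qq_points(residuals)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Invented residuals: one set follows the normal quantiles exactly, the other
# is stretched at both ends (higher than expected at the high end, lower than
# expected at the low end), the pattern described above as indicating epistasis.
quantiles = [NormalDist().inv_cdf((i + 0.5) / 20) for i in range(20)]
well_behaved = [2.0 * q for q in quantiles]
stretched_tails = [q ** 3 for q in quantiles]

print(round(qq_straightness(well_behaved), 3))
print(round(qq_straightness(stretched_tails), 3))
```

A value close to 1 corresponds to a straight plot; a noticeably lower value for the second set reflects the curvature at both ends that would be seen on the graph.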

The loci determined by the above method need not be located on a chromosome map of the species being tested, but are preferably so located to facilitate selection and use of equivalent probes and chromosome segments.

As will be appreciated by those skilled in the art, the method may be applied using additional primary and flanking markers to maximize association of the markers with the trait and determine the exact location of the genes governing the trait with sufficient accuracy to enable their isolation and sequencing.

The use of the RFLP probes described and claimed herein as linked with MDMV resistance enables the identification of loci governing MDMV resistance in any maize genome including both sweet and field corn varieties. The primary probes r179, gp144, and c512 are the most useful, although all the probes described above may be profitably used for this purpose.

The method, as applied to MDMV resistance in maize, is useful for manipulation of the trait in sweet corn, for which no economically valuable resistant cultivars have previously been developed.

A method is also provided for transferring a desired polygenic phenotype, preferably MDMV resistance in maize, from a donor genotype, preferably an MDMV resistant maize cultivar such as B68, into a recipient genotype, preferably an elite maize cultivar such as B73, comprising:

(a) determining the marker allele profiles of said donor and recipient genotypes having marker alleles substantially evenly distributed throughout the genome of said genotypes, as discussed above;

(b) identifying a minimum number of primary markers, preferably nucleic acid marker probes, showing marker alleles corresponding to a maximum presence of said phenotype in a progeny population obtained from crossing said donor and recipient genotypes by multiple regression by leaps and bounds and selecting a useful subset of those having the maximum individual contribution to said presence of said phenotype by multiple regression, all as discussed above. Preferably, not only the presence of said phenotype in said population is correlated with marker alleles, but also the severity of the phenotype is rated, and preferably at different times during the life cycle of individuals being rated, and all rated factors are considered in a single factor whose correspondence with the RFLP marker alleles is determined. Preferably, flanking markers are also determined as discussed above;

(c) backcrossing individuals from said progeny population having marker alleles corresponding to said desired phenotype and otherwise having a maximum number of said useful subset of marker alleles of step (b) corresponding to said recipient genotype with parents of the recipient genotype to produce a first backcross population;

(d) backcrossing individuals from said first backcross population having marker alleles corresponding to said desired phenotype and otherwise having a maximum number of said useful subset of marker alleles of step (b) corresponding to said recipient genotype with parents of the recipient genotype to produce second and subsequent backcross populations until a last population having the desired similarity to the recipient genotype is achieved;

(e) selfing individuals of said last population and identifying those having marker alleles homozygous for said desired phenotype.

Preferably selection of individuals for crossing and backcrossing is done by RFLP analysis in which both primary and flanking nucleic acid probes are used to identify and select individuals having the marker alleles shown by said probes corresponding to the donor phenotype. Individuals having said primary marker alleles corresponding to said donor genotype but having flanking marker alleles corresponding to said recipient genotype are tested for said phenotype by observation, and individuals exhibiting the desired phenotype are selected as having maximum recipient DNA and minimal donor DNA other than DNA determining the desired phenotype. This method is especially valuable in cases where DNA from the donor genotype tends to move in larger than normal segments, as occurs with B68, a donor for MDMV resistance. Individuals having primary marker alleles corresponding to the donor genotype and flanking marker alleles corresponding to the recipient genotype are much more rare than classical Mendelian segregation would predict when segments of the donor genome tend to move in clumps. Identification of such rare genotypes prior to breeding in a greenhouse setting will greatly facilitate the breeding process.

BRIEF DESCRIPTION OF THE FIGURES

FIGS. 1-11 are bar charts showing the effect of marker loci on MDMV resistance. B68 alleles are alleles from the MDMV resistant donor parent.

FIG. 1 is a bar chart comparing the effects of the number of B68 (MDMV resistant) alleles at marker loci r179 and gp144 on MDMV incidence to illustrate interaction between said loci.

FIG. 2 is a bar chart comparing the effects of the number of B68 (MDMV resistant) alleles at marker loci r179 and c926 on MDMV incidence to illustrate interaction between said loci.

FIG. 3 is a bar chart comparing the effects of the number of B68 (MDMV resistant) alleles at marker loci r179 and c329 on MDMV incidence to illustrate interaction between said loci.

FIG. 4 is a bar chart comparing the effects of the number of B68 (MDMV resistant) alleles at marker loci r179 and c512 on MDMV incidence to illustrate interaction between said loci.

FIG. 5 is a bar chart comparing the effects of the number of B68 (MDMV resistant) alleles at marker loci r179 and c262 on MDMV incidence to illustrate interaction between said loci.

FIG. 6 is a bar chart comparing the effects of the number of B68 (MDMV resistant) alleles at marker loci r179 and c587 on MDMV incidence to illustrate interaction between said loci.

FIG. 7 is a bar chart comparing the effects of the number of B68 (MDMV resistant) alleles at marker loci r179 and r92b on MDMV incidence to illustrate interaction between said loci.

In FIGS. 8-11, "4 B68 alleles" means the loci at both ends of the segment are homozygous for B68. "2 B68 alleles" means both loci defining the segment are heterozygous. "0 B68 alleles" means the loci at both ends of the segment are homozygous for B73 alleles.

FIG. 8 is a bar chart comparing the effects on MDMV incidence of the number of B68 (MDMV resistant) alleles in chromosome segments A-A1 defined by marker loci r179 and r271 and B-B1 defined by marker loci gp144 and r189 to illustrate interaction between said segments.

FIG. 9 is a bar chart comparing the effects on MDMV incidence×severity of the number of B68 (MDMV resistant) alleles in chromosome segments A-A1 defined by marker loci r179 and r271 and B-B1 defined by marker loci gp144 and r189 to illustrate interaction between said segments.

FIG. 10 is a bar chart comparing the effects on MDMV incidence of the number of B68 (MDMV resistant) alleles in chromosome segments A-A1 defined by marker loci r179 and r271 and C-C1 defined by marker loci c512 and r250 to illustrate interaction between said segments.

FIG. 11 is a bar chart comparing the effects on MDMV incidence×severity of the number of B68 (MDMV resistant) alleles in chromosome segments A-A1 defined by marker loci r179 and r271 and C-C1 defined by marker loci c512 and r250 to illustrate interaction between said segments.

FIG. 12 is a bar chart comparing the effects on MDMV incidence of the number of B68 (MDMV resistant) alleles in chromosome segments B-B1 defined by marker loci gp144 and r189 and C-C1 defined by marker loci c512 and r250 when segment A-A1 defined by marker loci r179 and r271 is homozygous for B68 alleles to illustrate interaction between said segments.

FIG. 13 is a bar chart comparing the effects on MDMV incidence×severity of the number of B68 (MDMV resistant) alleles in the chromosome segments B-B1 defined by marker loci gp144 and r189 and C-C1 defined by marker loci c512 and r250 when chromosome segment A-A1 defined by marker loci r179 and r271 is homozygous for B68 to illustrate interaction between said segments.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

As is known to the art, DNA restriction fragment length polymorphisms (RFLP's) may be used to reveal differences in DNA taken from different organisms.

DNA is isolated from an organism and digested with a restriction enzyme by methods known in the art. A particular restriction enzyme cleaves the DNA only at sites containing a specific nucleotide sequence, e.g. the restriction enzyme EcoRI cuts double stranded DNA only at the sequence GAATTC. Each restriction enzyme will cleave the DNA of a particular organism into a particular pattern of fragments with differing lengths as specified by the distances between restriction enzyme recognition sites. Single-site mutation or DNA rearrangement such as insertion and deletion can alter the distance between restriction enzyme recognition sites in different genotypes. The different lengths of particular fragments distinguish genotypes and varieties. Each genotype or variety will exhibit a particular pattern or "fingerprint" of different sized fragments when probed with the same set of clones. Obviously, the more unrelated the genotypes or varieties are, the more differences there will be in their "fingerprints". Theoretically, however, even closely related inbreds could be told apart if their DNA were digested with a sufficient number of restriction enzymes and probed with a sufficient number of clones.
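The way a single base change alters a fragment pattern can be illustrated with a minimal sketch. The two short sequences and the function name `digest` are hypothetical and chosen only for illustration; the sketch assumes a linear molecule and uses the known EcoRI cut position, one base into the GAATTC site.

```python
def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1) -> list[int]:
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the position of the cut inside the recognition site
    (EcoRI cuts between G and AATTC, i.e. one base into the site).
    """
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical alleles: allele_b carries a point change (A -> G) that
# destroys the second EcoRI site present in allele_a.
allele_a = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"
allele_b = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TT"

fragments_a = digest(allele_a)  # three fragments
fragments_b = digest(allele_b)  # two fragments, one of them longer
```

The loss of one site merges two fragments into a single longer one, which is the kind of length difference a probe homologous to that region would reveal.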

As is known in the art, DNA fragments may be separated by size using gel electrophoresis. When it is desired to compare the DNA of two organisms, the DNA samples of each are digested with the same restriction enzyme and the resulting fragments are separated according to size using electrophoresis. Many fragments from one genotype may differ in length from their counterparts in another genotype. Because any specific fragment represents a very small proportion of the total fragments, and cannot be distinguished from them by visual means on the gel, a sequence-specific probe which can be easily detected must be used to identify the specific homologous fragments in each DNA sample and permit comparison of fragment size.

Probes may be prepared by means known to the art, e.g. by using cDNA from RNA transcripts or genomic DNA from the organisms being studied. Plasmids containing cDNA clones used herein were made by isolating poly(A) RNA from tissue, such as dark grown coleoptile tissue or root tissue from B73 maize, and using reverse transcriptase as is known in the art to prepare a double-stranded copy DNA (cDNA) of the RNA. Plasmids containing genomic DNA were prepared by digesting B73 inbred maize DNA to completion with the restriction enzyme XhoI or PstI, and cloning the fragments using established methods known to the art. The bacterial plasmid vectors pSP64, pGEM3, pGEM2 and pGEMblue are examples of useful plasmids and are available from Promega Biotech, Madison, Wisconsin, and can be multiplied in suitable hosts. The specific bacterial host used above was E. coli MC1061.

Bacterial transformants containing the plasmids were screened using colony hybridization and DNA dot blot hybridization with radioactively labeled chloroplast, mitochondrial and nuclear maize DNA. Any colony or DNA sample which showed strong hybridization to any of the probes was rejected as containing a sequence which was organelle DNA or was highly repeated in the nuclear genome.

All of the above cloning procedures are known to the art. As is known to the art, a number of suitable plasmid vectors for the insertion of DNA sequences are available which are considered equivalent to those on deposit when bearing the clones of this invention or clones capable of hybridizing to the same genomic sequences.

Plasmid DNA isolated from each of the bacterial transformants was radioactively labeled to provide specific hybridization probes by means known to the art. In a preferred embodiment, DNA clones inserted into transcription vectors (e.g. pSP64, pGEM3 and pGEM2) may be transcribed into radioactively-labeled RNA probes using SP6 or T7 phage RNA polymerase. Alternatively, the entire plasmid or the isolated insert may be radioactively labeled by nick-translation using E. coli DNA Polymerase I. All of these procedures are known in the art. These probes will hybridize to homologous sequences in any maize genome.

Markers are analyzed for their utility by hybridization to DNA prepared from inbred organisms. A donor parent is selected for exhibiting the desired phenotype, and a recipient parent is selected exhibiting other desirable phenotypes.

DNA fragments from the organisms being studied are prepared by digesting genomic DNA with a restriction enzyme. Any restriction enzyme known to the art may be used, but enzymes which meet the following criteria are preferred: 1) inexpensive, 2) reliable (i.e. not subject to manufacturer's batch to batch variation nor difficult to use), 4) produce fragments ranging between about 2 and about 20 kilobase pairs, 5) exceptionally good at revealing polymorphism. Examples of preferred restriction enzymes are EcoRI, DraI, EcoRV, BclI, and BamHI. In the preferred embodiment described herein, only EcoRI was used.

After electrophoretic size separation, the DNA fragments are transferred from the electrophoresis medium (typically an agarose gel) to a solid support (e.g. nitrocellulose or nylon membranes) such that the pattern resolved on the gel is preserved on the membrane. This membrane is incubated with labelled probe during which time the probe hybridizes to the specific corresponding DNA sequence. By observing the location of the probe on the membrane, it is possible to determine differences, or "length polymorphisms", between homologous restriction fragments.

Probes which meet the following criteria are useful: 1) the probe hybridizes to a small number of genomic fragments (preferably less than three, and more preferably only one) so that the map position of each fragment can be determined unambiguously; 2) the probe must reveal polymorphism between the inbred lines that will be used to generate the segregating population used to map the clones; 3) it is desirable (though not essential) that the probe reveals polymorphism between closely related lines not necessarily pertinent to the immediate task of identifying the trait being studied; 4) the probe should produce reasonable hybridization signals and not artifactual signals which impede its routine use. The above screening procedure may be repeated using different restriction enzymes and different probes until a number of useful clones have been selected.

The above probe screening process establishes "fingerprints" or profiles of RFLP variants or alleles present in each variety. The alleles may be mapped on a chromosome map, but this is not necessary to the practice of the invention. As will be understood by those skilled in the art, markers other than RFLP probes may be part of the "fingerprint," e.g., isozyme markers and phenotypic markers, and data with respect to such markers may be substituted for RFLP data in the methods described herein.

Using established methods of genetic linkage analysis, such as described in the background section of this application, the segregation data derived from the above-described crosses may be used to link genes governing the trait being studied with particular probes, and also to map the positions on the maize genome of such probes if desired. This involves calculating the percent of the progeny in which a clone cosegregates with a known marker. Genetic map distance is defined by the percent recombination observed between two loci. For example, if a given clone co-segregated with a previously mapped marker 100% of the time there would be 0% recombination and thus, the clone would be 0 cM (centiMorgans; map units) from the previously mapped marker. A 10% recombination rate would indicate a 10 cM map distance from the previous marker, etc. (Sturtevant, A. H. (1913) "The linear arrangement of six sex-linked factors in Drosophila, as shown by their mode of association", J. Exp. Zool. 14:43).

Recombination data is obtained by examining the progeny of two parental genotypes which are distinguishable by RFLP's when probed with each clone.

If it is desired to map the entire initial set of probes used to study the trait, or to determine the linkage relationships among them, linkage analysis using an improved method of orthogonal contrasts based on the method of Mather, K., "The Measurement of Linkage in Heredity" (1931) Methuen & Co., London, and the method of maximum likelihood (Allard, R. W. (1956) "Formulas & Tables to Facilitate the Calculation of Recombination Values in Heredity," Hilgardia pp. 235-278) may be used to determine recombination frequencies between the probe in question and a known marker or another previously mapped or linked probe. The Mather method is expanded to cover the 6- and 9-cell matrices required for the analysis of co-dominant traits.

If the linkage analysis indicates that an association of a clone with another marker exists, a test of maximum likelihood is performed, as known to the art, to estimate the recombination frequency (also called "linkage" or "association") of the two traits or probes. This recombination frequency is designated p by convention. The standard error of this recombination frequency is also calculated by methods known to the art. The value of p is an estimate and because the recombination frequency can be thought of in terms of map units of separation, it indicates the most likely distance between two markers. The standard error is symmetrically distributed about the value p and indicates the range within which the true distance between markers is expected to lie. The stringency of the linkage analysis is such that p values will rarely exceed 0.20 (20 map units).
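The relationship between p, its standard error and map distance can be sketched as follows. This is a simplified illustration, not the Allard maximum likelihood formulas referred to above: it assumes recombinant gametes can be counted directly (as in a backcross), in which case the maximum likelihood estimate is the observed proportion and its standard error is binomial. F2 data from codominant markers require the more involved formulas, which are not reproduced here. The function name is hypothetical.

```python
import math


def recombination_estimate(recombinants: int, total: int) -> tuple[float, float, float]:
    """Return (p, standard error of p, map distance in cM).

    Assumes each of the `total` gametes can be classified directly as
    recombinant or parental, so p is a binomial proportion.
    """
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need 0 <= recombinants <= total and total > 0")
    p = recombinants / total
    standard_error = math.sqrt(p * (1.0 - p) / total)
    map_units = 100.0 * p  # 1% recombination is taken as 1 cM, as in the text
    return p, standard_error, map_units


# Example: 10 recombinant gametes out of 100 scored.
p, se, cm = recombination_estimate(10, 100)
```

With 10 recombinants in 100 gametes the estimate is about 10 cM with a standard error of about 3 cM, which shows why populations of this size place markers only approximately.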

The process is repeated with all selected probes. The association of each marker used in a particular cross is compared with each other marker which can be used to differentiate the parents used in that cross. In this way one cross generating preferably between about 50 and about 100 F2 individuals can be used to analyze a large number of markers.

Associated markers are arranged in linear order to form "linkage groups".

Linkage groups may be assigned to any of the chromosomes of the organism based on associations of markers in the group to markers previously mapped to these chromosomes, or by the use of other means known to the art such as analysis of monosomics. This latter method is well known to the art and is described, e.g., in T. Helentjaris et al. (1986) "Use of Monosomics to Map Cloned DNA Fragments in Maize", supra, incorporated herein by reference.

Recombination frequencies are not strictly analogous to physical distances since factors other than absolute separation on the chromosome may determine recombination rates. For this reason, map distances assigned to each marker are approximations of least inconsistency and represent therefore a compromise whereby map distances simply approximate recombination frequencies as closely as possible.

As an alternative to the development of a special set of probes covering the genome of the organism, such probes previously developed by the prior art may be used. Probes useful for studying traits in maize and their map locations are known to the art, as described in the background section.

It is useful but not necessary to determine the linkage relationships between the initial set of probes used to determine polymorphisms between the parent organisms prior to analyzing them for the contribution to a particular polygenic trait.

It is preferred that the probes used have only one locus in the genome; however, if probes having more than one locus must be used, they can be identified by band size which, as known to the art, may be ascertained by determining the band size linked to a probe also having an effect on the expression of the trait.

In the preferred embodiment, a trait, preferably one suspected to be polygenically determined, is selected for study. Parental organisms, preferably inbred lines, exhibiting the trait and not exhibiting the trait are chosen, and preferably the parent or inbred not exhibiting the trait is selected for otherwise desirable genomic material. This parent is called the "elite" parent. DNA from the parents is probed with the initial set of probes to determine which probes show polymorphisms when the parental DNA is compared.

Progeny segregating for the trait are selected, and preferably backcrossed to the elite parental line to maximize elite DNA, and selfed to produce individuals homozygous for the trait in question.

Progeny from the parental cross are analyzed using the initial set of markers to determine marker genotypes, or "fingerprints."

The progeny population resulting from the above-described crosses, and preferably backcrosses and selfing, is analyzed using the markers showing polymorphisms between the parents, and in addition is rated for the presence of the trait being studied. The percent of individuals exhibiting the trait in the population is termed the "incidence" herein. When the trait is one which can be rated quantitatively, such as for severity or intensity, as is MDMV resistance/susceptibility, it is preferred that this parameter be rated as well. This parameter is termed "severity" herein. Preferably, severity is rated on a scale yielding no more than about three or four values, such as the scale of 1 to 4 used in the preferred embodiment hereof. It is also preferred that several ratings, preferably about three, be taken over the life cycles of the individuals being rated so as to ensure that the presence of the phenotype is detected. Preferably incidence times severity is considered together in a single factor for evaluation, and the separate ratings during the life cycles of the individuals are separately evaluated.

As is known to the art, the effects of factors other than genotype on the ratings may be accounted for by appropriate experimental designs.

Preferably, the data with respect to RFLP probe alleles and observation of the trait being studied is analyzed by multiple regression by leaps and bounds ("leaps"), as described in Furnival, G. M. and Wilson, Jr., R. W. (1974), "Regression by leaps and bounds," Technometrics 16:499-511, to determine a subset of probes accounting for a maximum amount of phenotypic variation, followed by standard multiple regression as is known to the art to determine the relative contribution of each probe to the phenotype.

Multiple regression by leaps and bounds requires a high degree of computer capacity, and the method may need to be adapted, as discussed in the Examples hereof, to the available computer capacity. It is assumed that the presence of each donor allele in the DNA at a given locus contributes an equal amount to the presence of the phenotype.

The "leaps" analysis results in a manageable number of loci and associated primary probes, which account for the maximum presence of the trait, and are therefore said to be linked to the trait. By examining the successive sets of marker loci chosen by "leaps," flanking probes may be identified which are associated with the trait at each locus, but not as closely as the primary probes.
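The idea of choosing the marker subset that explains the most phenotypic variation can be sketched as below. This is not the Furnival and Wilson branch-and-bound algorithm: it is a plain exhaustive search, shown only because on small problems it returns the same best subset that "leaps" is designed to find without fitting every candidate. The marker names (m1, m2, m3), the allele codes and the phenotype values are hypothetical, and the phenotype is constructed to be exactly additive in m1 and m2.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of a least-squares fit of y on the given columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(design[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    mean = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def best_subset(markers, y, size):
    """Return (R^2, marker names) for the best subset of the given size."""
    return max(
        (r_squared([markers[name] for name in names], y), names)
        for names in combinations(sorted(markers), size)
    )


# Hypothetical allele counts (0, 1 or 2) for six individuals.
markers = {
    "m1": [0, 1, 2, 0, 1, 2],
    "m2": [0, 0, 1, 1, 2, 2],
    "m3": [2, 0, 1, 1, 0, 2],
}
phenotype = [10 * a + 5 * b for a, b in zip(markers["m1"], markers["m2"])]
fit, chosen = best_subset(markers, phenotype, size=2)
```

Each allele count enters the regression linearly, which mirrors the stated assumption that every donor allele at a locus contributes an equal amount to the phenotype.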

The multiple regression is performed on the smaller set of primary probes identified by "leaps." This analysis shows the relative contribution of each marker locus to the total explained phenotypic variance, compares the degree of explained variance across different times of rating, and generates data ("residuals") whose magnitude and distribution may be used to determine epistasis.

When a normal quantile-quantile plot of residuals shows deviation from the expected straight line, rising steeply at the high end, and lowering steeply at the low end, indicating that the trait is markedly more pronounced than a simple additive effect for each marker locus allele would indicate when maximal presence of the trait is predicted by the presence of appropriate marker alleles, and markedly less pronounced than expected when the relative absence of the trait is predicted by the presence of appropriate marker alleles, epistasis is suspected. Examination of the effects of the presence or absence of particular alleles at particular loci when alleles at one or more additional loci are present or absent, for example as shown in the Figures, shows which loci and combinations of loci are most effective in accounting for the trait. Probes at these marker loci may then be preferentially selected for use in manipulation of the trait in progeny populations.
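The coordinates of such a normal quantile-quantile plot can be computed as in the following minimal sketch. The plotting position (i + 0.5)/n is one common convention and is an assumption here, as is the function name; the residual values in the example are hypothetical.

```python
from statistics import NormalDist


def normal_qq_points(residuals):
    """Pair each sorted residual with its theoretical standard-normal quantile.

    Points lying on a straight line suggest the additive model fits;
    tails bending away from the line at both ends are the pattern the
    text associates with epistasis.
    """
    n = len(residuals)
    ordered = sorted(residuals)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        (standard_normal.inv_cdf((i + 0.5) / n), value)
        for i, value in enumerate(ordered)
    ]


# Hypothetical residuals from a fitted regression.
points = normal_qq_points([0.3, -1.2, 2.0])
```

Plotting the second coordinate against the first gives the graph described above.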

RFLP analysis of progeny selected by backcrossing and selfing for homozygosity of the desired trait along with maximum presence of the elite genotype will identify those individuals with DNA governing the trait but a minimum of surrounding donor DNA. Individuals who are heterozygous or homozygous for recipient parent alleles at flanking marker loci, preferably those which are homozygous for recipient parent alleles at such flanking sites, are selected for further breeding. Without the use of the RFLP technology described herein, it would be virtually impossible to identify rare individuals having the trait but minimal surrounding donor DNA when donor DNA tends to move in clumps, as does the B68 DNA used in the examples hereof.
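The selection rule in the preceding paragraph can be sketched as a simple filter. The genotype notation ("D" for a donor allele, "R" for a recipient allele), the locus names "P" and "F", the plant names and the function names are all hypothetical; the ranking simply prefers individuals carrying the fewest donor alleles at flanking loci.

```python
def donor_alleles(genotype: str) -> int:
    """Count donor alleles in a two-letter genotype such as 'DD', 'DR' or 'RR'."""
    return genotype.count("D")


def select_candidates(individuals, primary, flanking):
    """Return names of individuals worth advancing, best first.

    An individual qualifies when it carries at least one donor allele at
    every primary locus and at least one recipient allele at every
    flanking locus. Qualifiers are ordered by the number of donor alleles
    remaining at flanking loci, fewest first.
    """
    ranked = []
    for name, calls in individuals.items():
        has_trait_loci = all(donor_alleles(calls[m]) > 0 for m in primary)
        flanks_recovered = all(donor_alleles(calls[m]) < 2 for m in flanking)
        if has_trait_loci and flanks_recovered:
            flank_load = sum(donor_alleles(calls[m]) for m in flanking)
            ranked.append((flank_load, name))
    return [name for _, name in sorted(ranked)]


# Hypothetical marker calls at one primary locus "P" and one flanking locus "F".
plants = {
    "p1": {"P": "DD", "F": "DD"},  # donor segment still intact at the flank
    "p2": {"P": "DD", "F": "DR"},  # flank heterozygous
    "p3": {"P": "DD", "F": "RR"},  # flank fully recipient: preferred
    "p4": {"P": "RR", "F": "RR"},  # lost the donor allele at the primary locus
}
selected = select_candidates(plants, primary=["P"], flanking=["F"])
```

Individuals selected this way would still be checked for the phenotype by observation, as the text requires.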

The following Examples are provided by way of illustration and not in limitation of this invention. As will be apparent to those skilled in the art, alternative means exist for accomplishing many of the steps described in the examples and may be substituted therefor.

EXAMPLES

Example 1

Genetic Stocks and Breeding Scheme

The inbreds B68HtHt and B73HtHtrhmrhm, originally released by Iowa State University, Ames, Iowa, were used in this experiment. The gene designations Ht and rhm indicate that the accessions used are resistant to Helminthosporium turcicum race I, and Drechslera maydis, race (formerly Helminthosporium maydis). B68HtHt, a known source of Maize Dwarf Mosaic Virus resistance (Mikel, M. A. et al. (1984), "Genetics of resistance of two dent corn inbreds to maize dwarf mosaic virus and transfer of resistance into sweet corn," Phytopathology 74(4):467), was used as the female in the initial cross with B73, an MDMV susceptible line. The F1 was then selfed to produce F2 seeds and 157 F2 plants were selfed in the greenhouse resulting in 109 F3 progeny lines.

F3 progeny lines were tested for MDMV resistance at two locations (Farmington, Minn. and Madison, Wis.) using two blocks per location in a randomized complete block design. The parental lines, a susceptible sweet corn hybrid (Jubilee), and a resistant dent corn hybrid (8100, Jacques seed) were included in each block.

Remnant seed from the most resistant F3 progeny line was then backcrossed to a B73HtHtrhmrhm female. The seed from three backcross plants was bulked, planted out and selfed at Madison. Four seeds from each S1 ear were bulked and selfed again. 120 intact S2 ears were selected at random from approximately 300 ears obtained. The S2 seed was tested for MDMV resistance at Lincoln, Ill. and Madison, Wis. using a balanced incomplete block design with 34 entries per incomplete block, and four replications of each incomplete block. Each incomplete block included 30 progeny lines, a resistant dent corn check (LH151), the susceptible check Jubilee, and the original parental lines.

Example 2

Molecular methods

2.1 DNA extraction:

Second ear husk tissue, harvested at the silking stage, was used to isolate F2 DNA samples. For S2 progeny, leaf tissue samples from 12 field-grown plants at the 3-5 leaf stage were pooled and the DNA extracted.

Crude nuclei were isolated by a modification of Murray, M. G. and Kennard, W. C. (1984), "Altered chromatin conformation of the higher plant gene phaseolin," Biochemistry 23:4225. Nuclei extraction buffer contained 20 mM Pipes (pH 7), 3 mM MgCl₂, 0.5M hexylene glycol, 10 mM orthophenanthroline, 10 mM sodium metabisulfite and 200 μM aurintricarboxylic acid. Crude nuclear pellets (500×g, 10 min.) were lysed in 15 mM EDTA, 0.7M NaCl, 0.5% cetyltrimethyl ammonium bromide and 10 μg/ml proteinase K for 1 hour at 65° C. Insoluble material was removed by centrifugation (10,000×g, 10 min.) and the DNA precipitated by addition of ammonium acetate and isopropanol to final concentrations of 1.25M and 50% respectively. DNA was dissolved in DNA dialysis buffer containing 2 μg/ml RNAse A and incubated several hours at 37° C. After phenol extraction, the DNA was reprecipitated with isopropanol, rinsed and dissolved in DNA dialysis buffer. DNA concentrations were determined fluorometrically (Murray, M. G. and Paaren, H. E. (1986), "Nucleic acid quantitation by continuous flow fluorimetry," Anal. Biochem. 154:638-642).

2.2 Electrophoresis and Blotting:

Five μg of restricted DNA was typically loaded into 2.7 mm wide lanes cast in 0.75% agarose gels made in 100 mM Tris-acetate (pH 8.3), and 2.5 mM EDTA. Electrophoresis was at 1 volt/cm for 15-18 hours. Gels were stained for 30 min. in 0.1 μg/ml ethidium bromide prior to photography and UV nicking. A short wave UV dose of 1400 μW/cm² (one min. from one 15 watt germicidal bulb at a distance of 6 cm) was sufficient to introduce 1 nick per 3-4 kb and optimize transfer from the gel. We found UV nicking to be faster and more easily controlled than acid depurination. The gel was denatured in 150 mM NaOH and 3 mM EDTA for 20 minutes, rinsed briefly in distilled water and neutralized for 20 minutes in 150 mM sodium phosphate buffer (pH 7.8). Gels were transferred onto Genetran 45 or Zetabind membranes by capillary blotting using 10 mM sodium pyrophosphate (pH 9.8) as the transfer buffer. The membranes were soaked for at least 10 minutes in sodium pyrophosphate prior to transfer and dried thoroughly following transfer. Membranes were blocked for 2 to 3 hours at room temperature in 2% SDS, 0.5% BSA and 1 mM EDTA prior to their first use.

2.3 Probe Preparation and Hybridization:

RNA probes prepared with the Riboprobe (Promega, Madison, Wis.) system to a specific activity of about 8 to 1.2×10⁸ cpm/μg were used throughout this study. Plasmids were prepared according to Kieser, T. (1984), "Factors affecting the isolation of CCC DNA from Streptomyces lividans and Escherichia coli," Plasmid 12:19-36, and linearized to prevent transcription into the vector.

Blots were prehybridized overnight at room temperature in 100 mM sodium phosphate buffer (pH 7.8), 20 mM sodium pyrophosphate, 5 mM EDTA, 1 mM orthophenanthroline, 0.1% SDS, 500 μg/ml heparin sulfate, 10% dextran sulfate, 5 μg/ml poly(C), and 50 μg/ml herring sperm DNA. Probe was added to a final concentration of 2-500,000 cpm/ml. It was frequently possible to mix 3 probes at a time once the migration of each band was known. After 6 hours at 65° C. blots were rinsed in excess wash buffer (20 mM NaPB (pH 8.6), 5 mM NaPPi, 1 mM EDTA and 0.1% SDS) for 30 minutes at 65° C. Blots were incubated in RNAse solution (50 ng/ml RNAse A in 300 mM NaCl, 5 mM EDTA and 10 mM Tris-HCl (pH 7.5)) for 15 minutes at room temperature followed by the addition of proteinase K and SDS to 10 μg/ml and 0.1% respectively and incubation for 15 minutes at room temperature. Blots were given two final 15 minute washes in half strength wash buffer at 65° C. Blots were autoradiographed on Kodak XAR 5 film using one DuPont Cronex Lightning Plus intensifying screen at -80° C.

Example 3

Virus inoculation

Stocks of MDMV-A or MDMV-B were obtained from Jacques Seed Co., Prescott, Wis. in the form of infected sorghum plants. Stocks were subsequently verified by their ability to grow on sudan grass or Johnson grass (Compendium of Corn Diseases, 2nd Edition (1980) (Shurtleff, M. C. ed.), 61-63). To prepare sufficient inoculum for field experiments, 100 g of sorghum leaf tissue was homogenized in 600 ml ice cold 0.1M potassium phosphate buffer (pH 7.4) using a Cuisinart food processor. Debris was removed by filtration through cheesecloth and 0.01 g/ml of corundum (#22 Mm) was added prior to immediate application with a sprayer at 60 psi. Virus was amplified on the Jubilee variety of sweet corn (source: Rogers Bros. Seed Company), a line especially sensitive to MDMV. Twenty-five four-leaf stage plants were inoculated with MDMV-A and 25 with MDMV-B in the greenhouse. Inoculation was repeated two days later. After six weeks, large quantities of field inoculum were prepared as above from equal weights of MDMV-A and MDMV-B infected Jubilee tissue.

F3 progeny lines were inoculated twice, five days apart, at the 3-5 leaf stage with the mixed MDMV-A and MDMV-B inoculum. Plants were scored for incidence and severity 2, 4 and 6 weeks later. Incidence was calculated as number of plants infected over number of plants in the row. The criterion for presence of virus was the characteristic mosaic symptom on any leaves, regardless of extent. Every leaf on each plant was inspected, except for the last rating, in which leaves at eye level or below were examined. Severity was rated on a scale of 1-4 where 1 was an isolated streak of mosaic following the venation (1 streak/leaf on not more than two leaves), and 4 was a severely chlorotic, dwarfed plant with mosaic present on all visible leaves. A rating of two or more indicated systemic disease.
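The row-level scores can be sketched as follows. Incidence follows the definition given above. How per-plant severity ratings were combined within a row is not stated, so the use of the mean severity of infected plants is an assumption made only for this illustration, as are the function name and the convention of recording symptom-free plants as 0.

```python
def row_scores(severities):
    """Return (incidence, incidence x severity) for one row of plants.

    severities holds one entry per plant: 0 for a plant without mosaic
    symptoms, otherwise its severity rating on the 1-4 scale.
    Assumption: row severity is the mean rating of the infected plants.
    """
    if not severities:
        raise ValueError("row contains no plants")
    infected = [s for s in severities if s > 0]
    incidence = len(infected) / len(severities)
    mean_severity = sum(infected) / len(infected) if infected else 0.0
    return incidence, incidence * mean_severity


# Hypothetical row of four plants: two healthy, two infected (ratings 2 and 4).
incidence, incidence_x_severity = row_scores([0, 0, 2, 4])
```

Under this convention a fully resistant row scores zero on both measures, consistent with the desirable trait yielding a number close to zero.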

Because of the scale of the S2 field experiment and the need to ensure adequate and consistent infectivity, inoculum was prepared on site (i.e. in the field) from potted infected plants. S2 plants were inoculated once at the 3-5 leaf stage. Inoculation at the Madison site was done three days after tissue was taken for DNA extraction. S2 plants were rated 2, 4, 6 and 8 weeks after inoculation for both incidence and severity. A given rating was completed in one day, and each rating was begun at a different starting point to minimize the effect of human fatigue.

Example 4

Statistical analysis

Statistical analysis used UNIX and S software installed on a Pyramid model 90X computer (Mountain View, Calif.). Fifteen F3 genotypes were either lost or discarded due to poor stand (<50% of seeds planted), loss of F2 DNA sample, or insufficient F2 DNA. The statistical analysis described below included the 93 genotypes retained in the initial experiment.

Both field experiments were analyzed as two-level factorials (genotype and location) with repeated measures (time), after block or incomplete block effects were accounted for. Incidence data were transformed using the arcsine of the square root of the proportion p to stabilize the variance.
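
The variance-stabilizing transformation can be illustrated with a minimal sketch. The function name and the sample proportions are hypothetical and are not taken from the experiments; the result is expressed in radians, which is an assumption about the units used.

```python
import math

def arcsine_sqrt(p):
    """Return arcsin(sqrt(p)) in radians for a proportion p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("incidence must be a proportion between 0 and 1")
    return math.asin(math.sqrt(p))

# Hypothetical row incidences (infected plants / plants in the row).
incidences = [0.0, 0.04, 0.24, 0.50, 1.0]
transformed = [arcsine_sqrt(p) for p in incidences]
```

The transformation stretches the scale near 0 and 1, where the variance of a raw proportion is smallest, so that rows with very low or very high incidence are not given undue weight in the analysis of variance.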

The assessment of genetic linkage was done using the classical method of phenotypic categories as described by Mather, K. (1931), "The measurement of linkage in heredity," Methuen & Co., London, with additional orthogonal coefficients added to account for the 9-cell classification expected for the comparison of two codominant markers. The method of maximum likelihood (Allard, R. W. (1956), "Formulas and Tables to Facilitate the Calculation of Recombination Values in Heredity," Hilgardia 235-278) was used to calculate linkage.

The phenotypic data (Y data) used were the disease scores for incidence and incidence × severity for each rating done at 2, 4 and 6 weeks after inoculation. The incidence and incidence × severity data within a given rating were considered to be separate factors, while each rating in time was kept separate, thus yielding a total of six sets of Y values. As the desirable trait would yield a number close to zero, all loci homozygous for the B68 morphs were coded as zero, the loci homozygous for the B73 morphs were coded as 2, and heterozygotes were coded as 1. Marker loci which yielded five or more missing values were dropped (5 of 76). For those marker loci yielding 4 or fewer missing values, the value of any missing data was estimated by using the value of the marker locus most closely linked to the marker locus with missing data for the individual in question. There were a total of 44 estimated missing values in a data set composed of 6603 values (93 F3 individuals and 71 marker loci).
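
The coding and missing-value rules described above can be sketched as follows. The genotype labels ("B68", "H", "B73"), the function names and the sample calls are hypothetical, and which locus counts as "most closely linked" is assumed to be known from the linkage map.

```python
# Hypothetical labels: "B68" = homozygous donor morphs, "H" = heterozygote,
# "B73" = homozygous recipient morphs, None = missing observation.
CODES = {"B68": 0, "H": 1, "B73": 2}

def code_genotypes(calls):
    """Map genotype calls to 0/1/2, leaving missing calls as None."""
    return [None if c is None else CODES[c] for c in calls]

def usable(calls, max_missing=4):
    """A marker locus is kept only if it has four or fewer missing values."""
    return sum(c is None for c in calls) <= max_missing

def impute_from_linked(target, linked):
    """Fill each missing value in `target` with the same individual's value
    at the most closely linked marker locus (`linked`)."""
    return [l if t is None else t for t, l in zip(target, linked)]

# Four hypothetical individuals scored at two loci assumed to be linked.
marker_a = code_genotypes(["B68", None, "B73", "H"])
marker_b = code_genotypes(["B68", "H", "B73", "B73"])
filled = impute_from_linked(marker_a, marker_b)  # second individual filled from marker_b
```

Borrowing the value from the nearest linked locus relies on the low recombination frequency between tightly linked markers, so the estimate is wrong only when a crossover has occurred between the two loci in that individual.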

The potential association between the dependent variable Y (the F3 disease rating) and the independent variable X (the set of morphs for a given marker locus) was initially assessed using Mallows' method of multiple regression by leaps and bounds (Furnival, G. M. and Wilson, Jr., R. W. (1974), "Regressions by leaps and bounds," Technometrics 16(4):499-511), in which the criterion for subset selection is based on the test statistic Cp. The calculation of Cp results in a trade-off between maximizing the predictive value of the model and minimizing the number of variables in the selected subset (Weisberg, S. (1985), "Applied Linear Regression," (2nd edition)). The calculations which generate the subset values utilize two algorithms from a larger set which, if used together, compute the residual sums of squares for all possible regressions. These two algorithms can be combined to form a leap operation for finding the best subsets without examining all possible subsets.
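
The Cp criterion can be illustrated with a small sketch. Note the substitution: this example enumerates every subset exhaustively, which is practical only for a handful of markers, whereas the leaps and bounds procedure reaches the same best subsets without examining all of them. The marker names, the data, the inclusion of an intercept and all function names are assumptions made for illustration.

```python
from itertools import combinations

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def rss(columns, y):
    """Residual sum of squares of y regressed on an intercept plus `columns`."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(r[a] * r[b] for r in design) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(design, y)) for a in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(bj * xj for bj, xj in zip(beta, r))) ** 2
               for r, yi in zip(design, y))

def best_subset_by_cp(markers, y):
    """Return (Cp, names) for the subset with the smallest Mallows' Cp.

    Cp = RSS_p / s2 - n + 2p, where p counts the fitted parameters
    (including the intercept) and s2 is the residual mean square of
    the model containing every marker.
    """
    names = sorted(markers)
    n = len(y)
    s2 = rss([markers[m] for m in names], y) / (n - len(names) - 1)
    scored = []
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            p = size + 1
            cp = rss([markers[m] for m in subset], y) / s2 - n + 2 * p
            scored.append((cp, subset))
    return min(scored)

# Hypothetical 0/1/2 marker codes for eight individuals; y tracks m1 plus noise.
markers = {
    "m1": [0, 0, 1, 1, 2, 2, 1, 0],
    "m2": [0, 1, 0, 2, 1, 2, 0, 1],
    "m3": [2, 0, 1, 0, 2, 1, 1, 2],
}
y = [0.1, -0.1, 1.05, 0.95, 2.1, 1.9, 0.95, 0.05]
cp, chosen = best_subset_by_cp(markers, y)
```

The `2p` term is what penalizes larger subsets: adding a marker lowers the residual sum of squares, but the subset is preferred only if that reduction outweighs the added parameter.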

After leaps and bounds was done on each set of Y data, the marker loci selected were reanalyzed using standard multiple regression. The multiple regression analysis was used to compare the relative contribution of each marker locus to the total explained phenotypic variance, to compare the degree of explained variance (the multiple R² value) across different times of rating, and to examine the magnitude and distribution of residuals.

The phenotypic scores of the S2 population were handled in the same way as described above. The phenotypes of the S2 population were assessed by multiple regression using the set of marker loci selected, for all four disease ratings for incidence and incidence × severity.

Comparisons of the expected versus observed genotypes in the F2 were done using the Chi-square goodness of fit statistic. Calculation of allele frequency in the S2 was done using the Hardy-Weinberg expectation (p² + 2pq + q²), where p² was the observed frequency of B68 homozygotes at a given locus, and q² the observed frequency of B73 homozygotes.
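
Both calculations can be sketched briefly. The 1:2:1 expected ratio is the usual expectation for a codominant marker in an F2 and is assumed here, since the expected ratio is not stated explicitly; the counts and function names are hypothetical.

```python
import math

def chi_square_f2(n_b68, n_het, n_b73):
    """Chi-square statistic against the 1:2:1 ratio assumed for an F2."""
    total = n_b68 + n_het + n_b73
    expected = (0.25 * total, 0.50 * total, 0.25 * total)
    observed = (n_b68, n_het, n_b73)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def allele_freqs_from_homozygotes(n_b68, n_b73, total):
    """Estimate p and q as the square roots of the observed homozygote
    frequencies, following p^2 + 2pq + q^2."""
    return math.sqrt(n_b68 / total), math.sqrt(n_b73 / total)

# Hypothetical counts of B68 homozygotes, heterozygotes and B73 homozygotes.
stat = chi_square_f2(20, 53, 20)
p, q = allele_freqs_from_homozygotes(25, 25, 100)
```

With three genotype classes the statistic has two degrees of freedom, so a value above 5.991 would indicate a departure from the expected ratio at the 5% level.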

Example 5

Field data

The ANOVA for the field data revealed no significant differences between blocks at locations or between locations, for either incidence (I) data or incidence × severity (S) data. The differences between the times of rating, however, were highly significant (p<0.001 for both I and S data). Examination of mean scores by genotype for each rating revealed a general tendency for incidence and severity to increase slightly with time. However, eight F3 progeny lines showed a decrease in incidence between the first and last ratings of 15% or more, while in 12 lines incidence increased by 15% or more. The most dramatic drop occurred in line 117, in which the initial incidence of 0.24 dropped to 0.04. With the exception of 117, the severity ratings for those lines in which incidence decreased indicated that some plants in the row developed systemic infection, while others appeared to "outgrow" or contain the virus. The donor parent B68, line 117, and line 141 were the only entries in which no plants developed systemic disease. Line 141 was chosen as the donor for the backcross to B73 on the basis of consistently low incidence (0.09, 0.14, 0.09) and severity (1.2, 1.0, 1.4) ratings. At the time of this choice, the genotype of the F2 plant "141" which produced this line was unknown. We made the assumption that F2 plant "141" must have been homozygous for the majority, if not all, of the resistance genes from B68 in order to have produced a line which was at least as resistant as B68 itself in the year in which the ratings were done.

Example 6

Prediction

We chose to use a linear regression approach to identify marker loci linked to genes contributing to MDMV resistance. The restriction fragment length polymorphisms were considered the independent variable (X data). The genotype homozygous for the B68 morphs at a given marker locus was scored as "0," the heterozygous genotype as "1" and the genotype homozygous for the B73 morphs as "2." This method of weighting genotypes assumes that each B73 allele at each locus gives one "hit" of susceptibility, and assumes no interactions between different loci. The effect of potential recombination was not considered other than in the implicit sense that the observed phenotypic variability would be best accounted for by those loci which were most tightly linked to resistance genes. Our computing capacity was such that 71 probes could not be evaluated simultaneously. We constructed a computer program which performed linear regression by leaps and bounds using 20 probes at once. To reduce potential bias due to the order in which the groups of marker loci were analyzed, the program made recursive assessments of the data, beginning with the first 20 probes, proceeding to probes 5-25, 10-30, 15-35, etc., until the remaining number of probes was less than 15. These last probes were then combined with the first five in the set, and the final recursive regression was done. The order of the 71 probes was then randomized and the analysis was repeated. Each time leaps was performed, the program saved the best ten probe combinations. All of the marker loci subsets selected by leaps were again presented to leaps, and assessed recursively as before. The subset selections from this analysis were then combined and run again until a single subset remained. This entire process was done on each of three sets of marker loci data. Each set contained the same data, but the order of the data was randomized within each set.

The dependent variable Y consisted of six separate data sets: the three time ratings for the I data and the three time ratings for the S data. Regression by leaps and bounds was performed as described above for each set of Y data.
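
The overlapping grouping of probes can be sketched as below. This is one reading of the windowing scheme, with assumptions: windows are taken as 20 probes wide and advanced five probes at a time, sliding stops once fewer than 15 probes remain from the window start, and the leftover probes are then joined with the first five. The function name, parameters and integer probe labels are hypothetical.

```python
import random

def probe_windows(probes, width=20, step=5, min_tail=15):
    """Yield overlapping groups of probes (first 20, then shifted by 5, ...).

    Sliding continues while at least `min_tail` probes remain from the
    window start; the probes left over are then combined with the
    first `step` probes to form the final group.
    """
    n = len(probes)
    start = 0
    while n - start >= min_tail:
        yield probes[start:start + width]  # may be shorter near the end
        start += step
    if start < n:
        yield probes[start:] + probes[:step]

probes = list(range(71))              # stand-ins for the 71 marker loci
windows = list(probe_windows(probes))

# A second pass on a randomized order, as done to reduce order bias.
shuffled = probes[:]
random.Random(0).shuffle(shuffled)
windows_shuffled = list(probe_windows(shuffled))
```

Because consecutive windows share 15 probes, a marker that is selected only when it happens to be grouped with a particular neighbor is less likely to survive the repeated passes than one selected in every window that contains it.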

Upon completion of the analyses, the nine sets of marker loci for the I data (three time ratings for each of three randomized sets of X data) and the nine sets of marker loci for the S data were compared. Those marker loci which were chosen in all three data sets for each time rating for I data and S data were compared (Table 1). From this comparison the marker loci r179, gp144, c262, c512, c329, r271, r250, r189, c92b, c926, r324 and r248a were chosen for further investigation.

The first set of markers tested did not include markers r271, r189, r250, r324, and r248a (Table 2). The marker loci chosen accounted for 93-95% of the observed phenotypic variance for incidence, and 91-93% of the observed phenotypic variance for incidence times severity. A test of the relative contribution of r250 versus c512 was

TABLE 1
______________________________________
Markers chosen by "Leaps"
Three random data sets for each disease rating

One set of Incidence (INC) data is composed of three ratings, each of which has three subsets of markers chosen by leaps using the same group of marker loci, but analyzed in different orders to attenuate bias due to the order in which markers are evaluated. Similarly for the Incidence X Severity (INC X SEV) data.
Markers chosen only once in each set of three per each rating are deleted.
Markers chosen only within a single rating are deleted.
Markers chosen only within INC set or INC X SEV set are shown if above criteria are met.
______________________________________
INC 1
r179 gp144 c512 c329 r262 r271
r179 gp144 c512 c329 r262 r92b* r271 r189 r250 r324
r179 gp144 c512 c329 r92b r271 r189 r250 r324
INC 2
r179 gp144 c512 c329 r262 r92b r271
r179 gp144 c512 c329 r262 r92b r271 r189 r250
r179 gp144 c329 r262 r189 r250
INC 3
r179 gp144 c512 c926 r262 r92b r271 r250
r179 gp144 c512 c926 c329 r262 r92b r271 r189 r250 r324
r179 gp144 c512 c329 r92b r189 r324
INC X SEV 1
gp144 r262 r189 r250
gp144 c587 c926 r262 r189 r250
gp144 c587 c926 r262 r189 r250
INC X SEV 2
r179 gp144 c587 r262 r92b r271 r250 r248a*
r179 gp144 c587 r262 r92b r271 r250 r248a
r179 gp144 c587 r262 r92b r271 r250 r248a
INC X SEV 3
r179 gp144 c512 c587 c926 r92b r248a
r179 gp144 c587 c329 r92b r189 r248a
r179 gp144 c512 c587 c926 c329 r92b r189
______________________________________
Inspection of previously determined linkage data reveals that of the markers selected above, the following are linked pairs: r179-r271, gp144-r189, c587-c512-r250.
*a and b designations indicate the probe was found to map to more than one locus on the genome.

TABLE 2
______________________________________
Multiple Regression Analysis of Eight Probes
Most Consistently Chosen by Leaps and Bounds
Across Times of Rating
(Flanking Markers not Included)

          Coef        Std Err     t Value
______________________________________
Regression on first rating for Incidence
r179      0.1920859   0.03979275  4.827157
gp144     0.2035315   0.04309700  4.722638
c926      0.0841703   0.04116961  2.044477
c329      0.1418603   0.04016558  3.531886
c587      0.0739064   0.05554563  1.330554
c512      0.0933097   0.05278113  1.767861
r262      0.1148176   0.04179766  2.746986
r92b      0.1150422   0.03853215  2.985616
Residual Standard Error = 0.2660565
Multiple R-Square = 0.949127
N = 95 F Value = 202.8943 on 8, 87 df

Regression on second rating for Incidence
r179      0.1795560   0.04350730  4.127032
gp144     0.2010054   0.04712000  4.265820
c926      0.0948380   0.04501268  2.106917
c329      0.1550757   0.04391493  3.531274
c587      0.0639720   0.06073067  1.053371
c512      0.1139893   0.05770811  1.975274
r262      0.1092032   0.04569936  2.389599
r92b      0.1428178   0.04212903  3.390010
Residual Standard Error = 0.2908921
Multiple R-Square = 0.944296
N = 95 F Value = 184.3544 on 8, 87 df

Regression on third rating for Incidence
r179      0.1661475   0.04313687  3.851635
gp144     0.1767770   0.04671880  3.783851
c926      0.1046542   0.04462944  2.344960
c329      0.1427865   0.04354103  3.279355
c587      0.0925554   0.06021359  1.537118
c512      0.0967427   0.05721676  1.690810
r262      0.0876406   0.04531026  1.934232
r92b      0.1584079   0.04177032  3.792355
Residual Standard Error = 0.2884154
Multiple R-Square = 0.941025
N = 95 F Value = 173.5252 on 8, 87 df

Regression on first rating for Incidence × Severity
r179      0.4811642   0.1010414   4.762049
gp144     0.4055680   0.1094315   3.706134
c926      0.2045853   0.1045375   1.957051
c329      0.2108138   0.1019881   2.067043
c587      0.2204143   0.1410410   1.562768
c512      0.1583903   0.1340214   1.181829
r262      0.1601508   0.1061323   1.508974
r92b      0.1410394   0.0978405   1.441523
Residual Standard Error = 0.675568
Multiple R-Square = 0.915254
N = 95 F Value = 117.4491 on 8, 87 df

Regression on second rating for Incidence × Severity
r179      0.4449658   0.1035316   4.297872
gp144     0.3822563   0.1121286   3.409090
c926      0.2311180   0.1071139   2.157684
c329      0.2125058   0.1045017   2.033516
c587      0.3336661   0.1445170   2.308835
c512      0.1345009   0.1373244   0.979439
r262      0.1642594   0.1087479   1.510460
r92b      0.1652890   0.1002518   2.646226
Residual Standard Error = 0.6922181
Multiple R-Square = 0.923701
N = 95 F Value = 131.6558 on 8, 87 df

Regression on third rating for Incidence × Severity
r179      0.5146590   0.1072424   4.799024
gp144     0.3747488   0.1161475   3.226491
c926      0.2349898   0.1109531   2.117920
c329      0.2335804   0.1082472   2.157841
c587      0.2487058   0.1496968   1.661396
c512      0.1687855   0.1422464   1.186571
r262      0.0577514   0.1126457   0.5126821
r92b      0.3200847   0.1038451   3.082320
Residual Standard Error = 0.717029
Multiple R-Square = 0.918324
N = 95 F Value = 122.2729 on 8, 87 df
______________________________________

done by substituting the former for the latter and repeating the multiple regression. Although the multiple R² values were not significantly different (0.9446, r250 vs. 0.9443, c512), the partial regression coefficients of c512 were consistently, although slightly, higher. From this result we concluded that the gene of interest probably lay between c512 and r250. A similar approach was used for the r179-r271 pair and the gp144-r189 pair. As r206, the closest marker to r179 on the side opposite to that of r271, was not included in the final assessment by leaps and bounds, we concluded that the gene of interest was between r179 and r271, and closer to r179. Although no marker was available for gp144 on the side opposite to that of r189, the magnitude of the partial regression coefficients associated with gp144 and r189 indicated that these loci marked the segment in which the gene of interest was located. The relative contributions of r324 and r248a were assessed by adding each, one at a time, to the list shown in Table 2. The multiple R² values were not significantly increased, and the partial regression coefficients indicated minimal positive effects. As the purpose of the experiment was to predict resistance in a progeny population using the minimum number of markers for the best possible prediction, these two markers were not included in the set which was used for prediction of phenotype in the S2 progeny.

The coefficients of partial regression revealed that the relative importance of each marker locus changed somewhat across different rating times. The partial regression coefficients express the average change in standard deviation units of the Y data for one standard deviation unit of the marker locus under consideration when the effects of all the other loci are kept constant (Sokal, R. R. and Rohlf, F. J. (1981) Biometry (2nd edition)). The partial regression coefficient of r179 for the first rating of the S data, for example, is interpreted to mean that for those genotypes having the same score for each of the other loci (all zeros, or all ones, or all twos), an increase of one standard deviation in the value of r179 (an increase towards B73 morphs and away from B68 morphs) results in an increase of the S data score by 15% of its standard deviation. The total effects of all the partial regression coefficients are not necessarily additive because the X values, or the marker loci values, are correlated with each other. The magnitude of interdependence in each case may be calculated by dividing the standard error shown by the standardized unexplained variance (1-R²)/(n-k-1), where R² is the multiple R² value, n is the population size, and k is the number of variables. The number thus obtained (i.e. 0.0435/((1-0.9443)/86) = 67.16 for r179, second incidence rating) is the variance inflation factor (Marquardt, D. W. (1970), "Generalized inverses, ridge regression, biased linear estimation, and nonlinear estimation," Technometrics 12:591-612), and represents the factor by which the unexplained variance is inflated due to intercorrelation among the independent variables. The variance inflation factor will equal unity if the X variables are uncorrelated.

Although the VIFs do not indicate how the intercorrelation obtains, the evidence of lack of independence between the variables in an additive genetic model suggests a degree of epistatic interaction. Normal quantile-quantile plots of residuals showed excellent fit to a linear model within the moderately resistant to the moderately susceptible genotypes, but significant departure from linearity was observed for both the most resistant and the most susceptible genotypes. The pattern of these deviations also suggested an interaction between one or more of the marker loci.
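
The arithmetic quoted for r179 can be reproduced with a few lines. The sketch follows the calculation exactly as it is stated in the text (standard error divided by (1-R²)/(n-k-1)); the function name is hypothetical, and n = 95 and k = 8 are taken from Table 2, which gives the divisor of 86.

```python
def inflation_factor(std_err, r_squared, n, k):
    """Standard error divided by (1 - R^2) / (n - k - 1), as described in the text."""
    return std_err / ((1.0 - r_squared) / (n - k - 1))

# Values quoted for r179, second incidence rating (Table 2).
value = inflation_factor(0.0435, 0.9443, n=95, k=8)
print(round(value, 2))
```

The same function can be applied to the other marker loci in Table 2 by substituting the standard error of each locus and the multiple R² value of the corresponding rating.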

An examination of mean scores for disease by genotype indicated that at least one of the two B68 alleles for r179 must be present for any of the other marker loci, except gp144, to affect resistance (FIGS. 1-7). The marker locus gp144 also appeared to interact with r179, but a mild effect on resistance was seen even if the B68 morphs for r179 were absent. As the analyses above indicated that the loci of interest probably were within the r179-r271, gp144-r189, and c512-r250 segments, we examined those genotypes which had two B68 alleles for each locus at either end of the segment (4 total) vs. one B68 allele at either end (2 total) vs. no B68 alleles (FIGS. 8, 9). The effect of tracking the r179-r271 segment with the gp144-r189 segment did not dramatically affect the resistance associated with the genotype (compare FIG. 8 with FIG. 1). However, tracking all three segments showed that homozygosity for all three segments was clearly associated with a high level of resistance (FIG. 12). It is also clear that the marker segment r179-r271 is not of itself associated with resistance. Although these data are composed of small numbers of individuals (compare bar charts to data, Table 2), the excellent association between genotype and phenotype indicated that the markers and marker-bounded segments were potentially useful for the prediction of resistance in S2 progeny. From these analyses we concluded that the first criterion in resistance prediction was the presence of one, and preferably two, B68 alleles for the r179 marker locus. Once this criterion was met, those individuals having the maximum number of B68 alleles for gp144, c512, c329, and r262, respectively, would be expected to be resistant.

The ordering of the marker loci was determined by relative contribution to total R² values in both I and S data, and apparent magnitude of interaction with r179. We would also expect to see an improvement in the result if marked segments were included, although the effects of recombination could result in resistant individuals which were homozygous for the marker locus and heterozygous for the flanking marker.
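
The two-step prediction criterion can be sketched as a simple screening function. This is a simplification under stated assumptions: genotypes use the 0/1/2 coding described earlier (0 = homozygous for the B68 morphs), the secondary loci are weighted equally rather than in the order of importance given in the text, and the function names and example plant are hypothetical.

```python
SECONDARY = ("gp144", "c512", "c329", "r262")

def b68_alleles(code):
    """Number of B68 alleles implied by a 0/1/2 marker code."""
    return 2 - code

def predicted_rank(genotype):
    """Return None if the r179 criterion fails; otherwise return the count
    of B68 alleles over the secondary loci (a higher count means the
    individual is expected to be more resistant)."""
    if b68_alleles(genotype["r179"]) < 1:
        return None
    return sum(b68_alleles(genotype[m]) for m in SECONDARY)

# Hypothetical S2 individual.
plant = {"r179": 0, "gp144": 0, "c512": 1, "c329": 0, "r262": 2}
rank = predicted_rank(plant)
```

Individuals returning `None` would be set aside regardless of their genotype at the other loci, which reflects the observation that the other loci affected resistance only when at least one B68 allele of r179 was present.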

Example 7

Verification of Prediction

The data from the Lincoln location were not used in the test of the prediction. The disease differential between B73 and B68 (≈0.4-0.5 for I data and ≈1.0 for S data) was lower than expected, and examination of variance between balanced incomplete blocks showed unacceptable differences between disease scores for the same genotype, especially for those genotypes in the moderately susceptible to moderately resistant range. Infection was more severe at Madison, and balanced incomplete blocks received similar ratings (p<0.05). As in the earlier data, the effect of the time of rating was significant (p<0.001). All four ratings were examined separately.

Multiple regressions of the predictor set on each of the eight sets of ratings showed that although the multiple R² values were somewhat lower than those obtained when the model was fit, the accounting for Y was very good (Table 3). The lower multiple R² values for incidence were not unexpected because of the absence of marker loci c512 and c587, which were not readable. Examination of the effect of r179 in the 399 S2 progeny clearly confirms that r179 is essential for resistance potential, and supports the results of Mikel et al., supra, in which an epistatic gene was indicated. The effect of the other probes appeared to be primarily

TABLE 3
______________________________________
Multiple Regression Analysis of 7 Marker Loci
Predicted to be Involved in
Maize Dwarf Mosaic Virus Resistance,
and One Flanking Marker (r250)
Regression of 399 disease ratings against marker loci

          Coef         Std Err     t Value
______________________________________
Regression on first rating for Incidence
r179      2.117635e-1  0.0638428   3.316953
gp144     9.547554e-2  0.0791312   1.206546
r250      2.585734e-1  0.0365023   7.083759
c926      1.301470e-4  0.0632003   0.002059277
c329      1.513948e-1  0.0456896   3.313552
r262      1.276097e-1  0.0647424   1.971037
r92b      4.639700e-2  0.0620268   0.7480156
Residual Standard Error = 0.3604575
Multiple R-Square = 0.857966
N = 104 F Value = 83.7051 on 7, 97 df

Regression on second rating for Incidence
r179      0.1857877    0.06251866  2.971716
gp144     0.1106930    0.07749007  1.428480
r250      0.2558244    0.03574523  7.156884
c926      0.003455432  0.06188958  0.0558322
c329      0.1632299    0.04474197  3.648251
r262      0.09613084   0.06339965  1.516267
r92b      0.07187376   0.06074035  1.183295
Residual Standard Error = 0.352982
Multiple R-Square = 0.862342
N = 104 F Value = 86.8066 on 7, 97 df

Regression on third rating for Incidence
r179      0.2437292    0.0603624   4.037764
gp144     0.1212962    0.0748175   1.621228
r250      0.2502463    0.0345124   7.250910
c926      0.01593832   0.0597550   0.2667276
c329      0.1576393    0.0431989   3.649155
r262      0.1028069    0.0612130   1.679494
r92b      0.06662901   0.0586454   1.136132
Residual Standard Error = 0.3408075
Multiple R-Square = 0.882785
N = 104 F Value = 104.3624 on 7, 97 df

Regression on fourth rating for Incidence
r179      0.2937771    0.04858226  6.047005
gp144     0.1449855    0.06021630  2.407746
r250      0.1887696    0.02777705  6.795886
c926      0.03259441   0.04809340  0.677731
c329      0.09876792   0.03476828  2.840748
r262      0.08933747   0.04926686  1.813338
r92b      0.1141951    0.04720036  2.419369
Residual Standard Error = 0.2742964
Multiple R-Square = 0.914860
N = 104 F Value = 148.9006 on 7, 97 df

Regression on first rating for Incidence × Severity
r179      0.4174503    0.0734569   5.682928
gp144     0.1732581    0.0910477   1.902938
r250      0.2143294    0.0419992   5.103179
c926      0.04433341   0.0727177   0.6096642
c329      0.1473353    0.0525700   2.802649
r262      0.1932943    0.0744920   2.594832
r92b      0.02681083   0.0713674   0.3756731
Residual Standard Error = 0.4147391
Multiple R-Square = 0.879232
N = 104 F Value = 100.8846 on 7, 97 df

Regression on second rating for Incidence × Severity
r179      0.3969871    0.0863667   4.596527
gp144     0.2423920    0.1070491   2.264307
r250      0.2462625    0.0493804   4.987045
c926      0.0657849    0.0854977   0.769435
c329      0.1838871    0.0618090   2.975083
r262      0.1955799    0.0873838   2.233060
r92b      0.1768786    0.0839101   2.107954
Residual Standard Error = 0.4876284
Multiple R-Square = 0.888229
N = 104 F Value = 110.1209 on 7, 97 df

Regression on third rating for Incidence × Severity
r179      0.5092413    0.07231843  7.041653
gp144     0.2694907    0.08963659  3.006481
r250      0.1928990    0.04134828  4.665225
c926      0.0751274    0.07159073  1.049401
c329      0.1621395    0.05175526  3.132813
r262      0.1660191    0.07333751  2.263767
r92b      0.1482765    0.07026136  2.110357
Residual Standard Error = 0.4083113
Multiple R-Square = 0.917977
N = 104 F Value = 155.0855 on 7, 97 df

Regression on fourth rating for Incidence × Severity
r179      0.5464476    0.06342066  8.616237
gp144     0.2575648    0.07860808  3.276569
r250      0.1329957    0.03626096  3.667739
c926      0.1278032    0.06278250  2.035650
c329      0.1037226    0.04538751  2.285267
r262      0.1340554    0.06431437  2.084378
r92b      0.2208268    0.06161670  3.583878
Residual Standard Error = 0.3580744
                      Multiple R-Square = 0.9334                                                    N = 104 F Value = 194.2056 on 7, 97 df                                        ______________________________________                                         The flanking marker to p512 is r250 (3.8 mu from c512 on side opposite to     c587)                                                                    

additive, as predicted, when the r179 marker locus has at least one, and preferably two, B68 alleles.
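
As an illustrative check of how the summary statistics in the regression tables above relate to one another, the following minimal Python sketch recomputes a t ratio and an F value from the tabulated numbers. It rests on two assumptions that are not stated explicitly in the tables: that the third column of each row is the coefficient divided by its standard error, and that each regression was fitted without an intercept, so that the reported Multiple R-Square is uncentered and the F value has 7 and 97 degrees of freedom (N = 104). The function names t_ratio and f_from_r2 are hypothetical and serve only for illustration.

```python
def t_ratio(coef, std_err):
    """Ratio of a regression coefficient to its standard error."""
    return coef / std_err


def f_from_r2(r2, n_params, resid_df):
    """F statistic implied by an (assumed uncentered) R-square.

    Assumes a no-intercept model with n_params regressors and
    resid_df residual degrees of freedom.
    """
    return (r2 / n_params) / ((1.0 - r2) / resid_df)


# Values copied from the first-rating Incidence x Severity regression.
rows = {
    "r179": (0.4174503, 0.0734569),
    "r250": (0.2143294, 0.0419992),
}

for probe, (coef, std_err) in rows.items():
    print(probe, round(t_ratio(coef, std_err), 4))

# Seven probes as regressors, 104 - 7 = 97 residual degrees of freedom.
print(round(f_from_r2(0.879232, 7, 97), 2))
```

Under these assumptions the recomputed values agree closely with the tabulated third column and F Value for that regression, which suggests the assumptions are reasonable, although the tables themselves do not confirm them.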

We claim:
 1. A nucleic acid probe designated gp144.